Graph Representation and Anonymization in Large Survey Rating Data

Authors

  • Xiaoxun Sun
  • Min Li

Abstract

In this chapter, we study the challenges of protecting the privacy of individuals in large public survey rating data sets. A recent study shows that individuals can be re-identified from supposedly anonymous movie rating records. Survey rating data usually contains ratings of both sensitive and non-sensitive issues, and the ratings of sensitive issues involve personal privacy. Even though survey participants do not reveal any of their ratings, their survey records are potentially identifiable using information from other public sources. None of the existing anonymisation principles can effectively prevent such breaches in large survey rating data sets. We tackle the problem by defining a principle called the (k, ε)-anonymity model. Intuitively, the principle requires that, for each transaction t in the given survey rating data T, at least (k − 1) other transactions in T have ratings similar to those of t, where the similarity is controlled by ε. The (k, ε)-anonymity model is formulated through its graphical representation, and a specific graph-anonymisation problem is studied using graph modification techniques from graph theory. Various cases are analyzed, and methods are developed to make the modified graph meet the (k, ε) requirements. The methods are applied to two real-life data sets to demonstrate their efficiency and practical utility.
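
The (k, ε) requirement described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how similarity is measured, so the sketch assumes two transactions are similar when their ratings differ by at most ε on every issue. The function names, the toy data set `T`, and the reading of the graph representation (one node per transaction, one edge per similar pair, so the requirement becomes "every node has degree at least k − 1") are illustrative assumptions, not the chapter's own definitions or algorithms.

```python
from itertools import combinations


def is_similar(a, b, eps):
    """Assumed similarity test: ratings differ by at most eps on every issue."""
    return all(abs(x - y) <= eps for x, y in zip(a, b))


def build_similarity_graph(ratings, eps):
    """Return adjacency sets; an edge joins two transactions with similar ratings."""
    graph = {i: set() for i in range(len(ratings))}
    for i, j in combinations(range(len(ratings)), 2):
        if is_similar(ratings[i], ratings[j], eps):
            graph[i].add(j)
            graph[j].add(i)
    return graph


def violating_transactions(ratings, k, eps):
    """Indices of transactions with fewer than k - 1 similar neighbours."""
    graph = build_similarity_graph(ratings, eps)
    return sorted(i for i, nbrs in graph.items() if len(nbrs) < k - 1)


def satisfies_k_eps_anonymity(ratings, k, eps):
    """True when every transaction has at least k - 1 similar neighbours."""
    return not violating_transactions(ratings, k, eps)


# Hypothetical toy data: each tuple holds one participant's ratings on three issues.
T = [
    (5, 4, 1),
    (5, 5, 1),
    (4, 4, 2),
    (1, 2, 5),
]

if __name__ == "__main__":
    print(violating_transactions(T, k=3, eps=1))
    print(satisfies_k_eps_anonymity(T[:3], k=3, eps=1))
```

In this toy data the first three transactions are pairwise similar for ε = 1, while the fourth has no similar neighbour, so it is the one a graph-modification step would need to handle. The pairwise comparison is quadratic in the number of transactions and is meant only to make the definition concrete, not to scale to large survey data.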

Similar resources

An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

In recent years, privacy concerns about social network graph data publishing have increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...

Privacy-Preserving Data Analysis on Graphs and Social Networks

While literature within the field of privacy-preserving data mining (PPDM) has been around for many years, attention has mostly been given to the perturbation and anonymization of tabular data; understanding the role of privacy over graphs and networks is still very much in its infancy. In this chapter, we survey a very recent body of research on privacy-preserving data analysis over graphs and...

Efficient Edge Anonymization of Large Social Graphs

Edges in a social graph may represent private information that needs to be protected. Due to their graph partition schemes, existing edge anonymization methods have several drawbacks, such as high utility loss and high computational overhead. In this paper, we present a new edge anonymization method, which partitions a graph using a new vertex equivalence relation called the neighbor-set equiva...

An Iterative Algorithm for Graph De-anonymization

The availability of social network data is indispensable for numerous types of research. Nevertheless, data owners are often reluctant to release social network data, as the release may reveal the private information of the individuals involved in the data. To address this problem, several techniques have been proposed to anonymize social networks for privacy preserving publications. To evaluat...

How to Quantify Graph De-anonymization Risks

An increasing amount of data are becoming publicly available over the Internet. These data are released after applying some anonymization techniques. Recently, researchers have paid significant attention to analyzing the risks of publishing privacy-sensitive data. Even if data anonymization techniques were applied to protect privacy-sensitive data, several de-anonymization attacks have been pro...

Publication date: 2011